
    Ordering-sensitive and Semantic-aware Topic Modeling

    Topic modeling of textual corpora is an important and challenging problem. Most previous work makes the "bag-of-words" assumption, which ignores the ordering of words. This assumption simplifies the computation, but it unrealistically discards the ordering information and the semantics of words in context. In this paper, we present a Gaussian Mixture Neural Topic Model (GMNTM) which incorporates both the ordering of words and the semantic meaning of sentences into topic modeling. Specifically, we represent each topic as a cluster of multi-dimensional vectors and embed the corpus into a collection of vectors generated by the Gaussian mixture model. Each word is affected not only by its topic, but also by the embedding vector of its surrounding words and the context. The Gaussian mixture components and the topic of documents, sentences and words can be learnt jointly. Extensive experiments show that our model can learn better topics and more accurate word distributions for each topic. Quantitatively, compared with state-of-the-art topic modeling approaches, GMNTM obtains significantly better performance in terms of perplexity, retrieval accuracy and classification accuracy. Comment: To appear in proceedings of AAAI 201

    Multivariate analysis based on the maximum standard unit value of 18F-fluorodeoxyglucose positron emission tomography/computed tomography and computed tomography features for preoperative predicting of visceral pleural invasion in patients with subpleural clinical stage IA peripheral lung adenocarcinoma

    PURPOSE: Preoperative prediction of visceral pleural invasion (VPI) is important because it enables thoracic surgeons to choose appropriate surgical plans. This study aimed to develop and validate a multivariate logistic regression model incorporating the maximum standardized uptake value (SUVmax) and valuable computed tomography (CT) signs for the non-invasive prediction of VPI status in subpleural clinical stage IA lung adenocarcinoma patients before surgery. METHODS: A total of 140 patients with subpleural clinical stage IA peripheral lung adenocarcinoma were recruited and divided into a training set (n = 98) and a validation set (n = 42), according to the positron emission tomography/CT examination temporal sequence, with a 7:3 ratio. Next, VPI-positive and VPI-negative groups were formed based on the pathological results. In the training set, the clinical information, the SUVmax, the relationship between the tumor and the pleura, and the CT features were analyzed using univariate analysis. The variables with significant differences were included in the multivariate analysis to construct a prediction model. A nomogram based on multivariate analysis was developed, and its predictive performance was verified in the validation set. RESULTS: The size of the solid component, the consolidation-to-tumor ratio, the solid component pleural contact length, the SUVmax, the density type, the pleural indentation, the spiculation, and the vascular convergence sign demonstrated significant differences between VPI-positive (n = 40) and VPI-negative (n = 58) cases on univariate analysis in the training set. A multivariate logistic regression model incorporated the SUVmax [odds ratio (OR): 1.753, P = 0.002], the solid component pleural contact length (OR: 1.101, P = 0.034), the pleural indentation (OR: 5.075, P = 0.041), and the vascular convergence sign (OR: 13.324, P = 0.025) as the best combination of predictors, which were all independent risk factors for VPI in the training group. The nomogram indicated promising discrimination, with an area under the curve value of 0.892 [95% confidence interval (CI), 0.813–0.946] in the training set and 0.885 (95% CI, 0.748–0.962) in the validation set. The calibration curve demonstrated that its predicted probabilities were in acceptable agreement with the actual probability. The decision curve analysis illustrated that the current nomogram would add more net benefit. CONCLUSION: The nomogram integrating the SUVmax and the CT features could non-invasively predict VPI status before surgery in subpleural clinical stage IA lung adenocarcinoma patients.
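
The odds ratios reported above can be read as multiplicative effects on the odds of VPI. The following is a minimal sketch of that reading, not the authors' model: it uses only the four ORs quoted in the abstract, the dictionary keys and function names are illustrative, and the unit of each predictor is an assumption because the abstract does not state it. No intercept is reported, so absolute probabilities cannot be reproduced from these numbers.

```python
import math

# Odds ratios quoted in the abstract for the multivariate model.
# The key names are illustrative labels, not identifiers from the study.
odds_ratios = {
    "SUVmax": 1.753,
    "solid_component_pleural_contact_length": 1.101,
    "pleural_indentation": 5.075,
    "vascular_convergence_sign": 13.324,
}


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient that corresponds to an odds ratio."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, delta):
    """Factor by which the odds change when the predictor rises by `delta` units,
    with all other predictors held fixed."""
    return odds_ratio ** delta


coefficients = {name: log_odds_coefficient(value) for name, value in odds_ratios.items()}

# Example: a two-unit increase in SUVmax (unit assumed, not stated in the abstract).
suvmax_two_unit_factor = odds_multiplier(odds_ratios["SUVmax"], 2)
```

Under these assumptions, each coefficient is the natural logarithm of its OR, and an increase of several units compounds multiplicatively rather than additively.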

    HHMF: hidden hierarchical matrix factorization for recommender systems

    Matrix factorization (MF) is one of the most powerful techniques used in recommender systems. MF models the (user, item) interactions behind historical explicit or implicit ratings. Standard MF does not capture hierarchical structural correlations, such as publisher and advertiser in advertisement recommender systems, or the taxonomy (e.g., tracks, albums, artists, genres) in music recommender systems. There are a few hierarchical MF approaches, but they require the hierarchical structures to be known beforehand. In this paper, we propose a Hidden Hierarchical Matrix Factorization (HHMF) technique, which learns the hidden hierarchical structure from the user-item rating records. HHMF does not require prior knowledge of the hierarchical structure; hence, as opposed to..
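
For orientation, the standard (flat) MF baseline that the abstract contrasts with HHMF can be sketched as follows. This is plain MF trained by stochastic gradient descent with L2 regularization, not the HHMF method, whose details the truncated abstract does not give. The function names, hyperparameters, and toy ratings are all hypothetical.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.05, reg=0.02,
             epochs=200, seed=0):
    """Plain matrix factorization fitted by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error plus L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the predictions on the given triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Hypothetical toy data: (user index, item index, rating).
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (3, 0, 5), (3, 3, 4), (2, 3, 1)]

P0, Q0 = train_mf(toy_ratings, n_users=4, n_items=4, epochs=0)
P, Q = train_mf(toy_ratings, n_users=4, n_items=4)
```

Nothing in this baseline ties users or items to groups such as publishers or genres, which is the gap the abstract says hierarchical approaches address.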

    High-Throughput Sequencing of MicroRNAs in Adenovirus Type 3 Infected Human Laryngeal Epithelial Cells

    Adenovirus infection can cause various illnesses depending on the infecting serotype, such as gastroenteritis, conjunctivitis, cystitis, and rash illness, but the infection mechanism is still unknown. MicroRNAs (miRNA) have been reported to play essential roles in cell proliferation, cell differentiation, and pathogenesis of human diseases including viral infections. We analyzed the miRNA expression profiles from adenovirus type 3 (AD3) infected human laryngeal epithelial (Hep2) cells using SOLiD deep sequencing. 492 precursor miRNAs were identified in the AD3 infected Hep2 cells, and 540 precursor miRNAs were identified in the control. A total of 44 miRNAs showed higher expression and 36 miRNAs showed lower expression in the AD3 infected cells than in the control. The biogenesis of miRNAs was analyzed, and some of the SOLiD results were confirmed by quantitative PCR analysis. The present study may provide a useful clue for research into the biological function of miRNAs in AD3 infection.

    Ultra-low-dose spectral-detector computed tomography for the accurate quantification of pulmonary nodules: an anthropomorphic chest phantom study

    PURPOSE: To assess the quantification accuracy of pulmonary nodules using virtual monoenergetic images (VMIs) derived from spectral-detector computed tomography (CT) under an ultra-low-dose scan protocol. METHODS: A chest phantom consisting of 12 pulmonary nodules was scanned using spectral-detector CT at 100 kVp/10 mAs, 100 kVp/20 mAs, 120 kVp/10 mAs, and 120 kVp/30 mAs. Each scanning protocol was repeated three times. Each CT scan was reconstructed utilizing filtered back projection, hybrid iterative reconstruction, iterative model reconstruction (IMR), and VMIs of 40–100 keV. The signal-to-noise ratio and air noise of images, absolute differences, and absolute percentage measurement errors (APEs) of the diameter, density, and volume of the four scan protocols and ten reconstruction images were compared. RESULTS: With each fixed reconstruction image, the four scanning protocols exhibited no significant differences in APEs for diameter and density (all P > 0.05). Of the four scan protocols and ten reconstruction images, APEs for nodule volume had no significant differences (all P > 0.05). At 100 kVp/10 mAs, APEs for density using IMR were the lowest (APE-mean: 6.69), but no significant difference was detected between VMIs at 50 keV (APE-mean: 11.69) and IMR (P = 0.666). In the subgroup analysis, at 100 kVp/10 mAs, there were no significant differences between VMIs at 50 keV and IMR in diameter and density (all P > 0.05). The radiation dose at 100 kVp/10 mAs was reduced by 77.8% compared with that at 120 kVp/30 mAs. CONCLUSION: Compared with IMR, reconstruction at 100 kVp/10 mAs and 50 keV provides a more accurate quantification of pulmonary nodules, and the radiation dose is reduced by 77.8% compared with that at 120 kVp/30 mAs, demonstrating great potential for ultra-low-dose spectral-detector CT.
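
The absolute percentage measurement error used above is conventionally the absolute deviation from a reference value, expressed as a percentage of that reference. The sketch below assumes this conventional definition and assumes the phantom's known nodule values serve as the reference. The abstract does not spell out either point, and the example numbers are hypothetical.

```python
def absolute_percentage_error(measured, reference):
    """APE in percent: |measured - reference| / |reference| * 100."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(measured - reference) / abs(reference) * 100.0


def mean_ape(measurements, reference):
    """Mean APE over repeated measurements of the same reference quantity."""
    errors = [absolute_percentage_error(m, reference) for m in measurements]
    return sum(errors) / len(errors)


# Hypothetical example: three repeated diameter measurements (mm) of a nodule
# whose nominal phantom diameter is 10 mm.
repeat_measurements_mm = [9.0, 10.5, 11.0]
example_mean_ape = mean_ape(repeat_measurements_mm, reference=10.0)
```

Because the error is taken as an absolute value, over- and under-estimates do not cancel when averaged across repeated scans.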

    Baseline whole-lung CT features deriving from deep learning and radiomics: prediction of benign and malignant pulmonary ground-glass nodules

    Objective: To develop and validate a model for predicting benign and malignant ground-glass nodules (GGNs) based on whole-lung baseline CT features derived from deep learning and radiomics. Methods: This retrospective study included 385 GGNs from 3 hospitals, confirmed by pathology. We used 239 GGNs from Hospital 1 as the training and internal validation set; 115 and 31 GGNs from Hospital 2 and Hospital 3 as the external test sets 1 and 2, respectively. An additional 32 stable GGNs from Hospital 3 with more than five years of follow-up were used as the external test set 3. We evaluated clinical and morphological features of GGNs at baseline chest CT and extracted the whole-lung radiomics features simultaneously. In addition, baseline whole-lung CT image features were extracted using a convolutional neural network. We used a back-propagation neural network to construct five prediction models based on different combinations of the features used for training. The area under the receiver operating characteristic curve (AUC) was used to compare the prediction performance among the five models. The DeLong test was used to compare the differences in AUC between models pairwise. Results: The model integrating clinical-morphological features, whole-lung radiomic features, and whole-lung image features (CMRI) performed best among the five models, and achieved the highest AUC in the internal validation set, external test set 1, and external test set 2, which were 0.886 (95% CI: 0.841-0.921), 0.830 (95% CI: 0.749-0.893) and 0.879 (95% CI: 0.712-0.968), respectively. In the above three sets, the differences in AUC between the CMRI model and other models were significant (all P < 0.05). Moreover, the accuracy of the CMRI model in the external test set 3 was 96.88%. Conclusion: The baseline whole-lung CT features were feasible for predicting whether GGNs are benign or malignant, which is helpful for more refined management of GGNs.

    Tundra microbial community taxa and traits predict decomposition parameters of stable, old soil organic carbon.

    The susceptibility of soil organic carbon (SOC) in tundra to microbial decomposition under warmer climate scenarios potentially threatens a massive positive feedback to climate change, but the underlying mechanisms of stable SOC decomposition remain elusive. Herein, Alaskan tundra soils from three depths (a fibric O horizon with litter and coarse roots, an O horizon with decomposing litter and roots, and a mineral-organic mix, lying just above the permafrost) were incubated. Resulting respiration data were assimilated into a 3-pool model to derive decomposition kinetic parameters for fast, slow, and passive SOC pools. Bacterial, archaeal, and fungal taxa and microbial functional genes were profiled throughout the 3-year incubation. Correlation analyses and a Random Forest approach revealed associations between model parameters and microbial community profiles, taxa, and traits. There were more associations between the microbial community data and the SOC decomposition parameters of slow and passive SOC pools than those of the fast SOC pool. Also, microbial community profiles were better predictors of model parameters in deeper soils, which had higher mineral contents and relatively greater quantities of old SOC than in surface soils. Overall, our analyses revealed the functional potential of microbial communities to decompose tundra SOC through a suite of specialized genes and taxa. These results portray divergent strategies by which microbial communities access SOC pools across varying depths, lending mechanistic insights into the vulnerability of what is considered stable SOC in tundra regions.
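
One common way to write a 3-pool decomposition model is as three independent pools that each decay by first-order kinetics. The sketch below assumes that form purely for illustration: the abstract names the fast, slow, and passive pools but does not give the model equations. The pool fractions, rate constants, and function name are hypothetical.

```python
import math


def cumulative_respired_carbon(t, total_c, fractions, rates):
    """Cumulative carbon respired by time t under independent first-order pools.

    Each pool i starts with total_c * fractions[i] and decays at rate rates[i],
    so it has released total_c * fractions[i] * (1 - exp(-rates[i] * t)) by time t.
    """
    if len(fractions) != len(rates):
        raise ValueError("fractions and rates must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("pool fractions must sum to 1")
    return sum(total_c * f * (1.0 - math.exp(-k * t))
               for f, k in zip(fractions, rates))


# Hypothetical parameters: fast, slow, and passive pool fractions and
# first-order rate constants (per day). These are not values from the study.
pool_fractions = (0.05, 0.35, 0.60)
pool_rates_per_day = (0.1, 0.001, 0.00001)

respired_after_3_years = cumulative_respired_carbon(
    t=3 * 365, total_c=100.0, fractions=pool_fractions, rates=pool_rates_per_day
)
```

In a model of this shape, fitting the respiration time series amounts to estimating the pool fractions and rate constants, which are the kind of kinetic parameters the abstract relates to microbial community data.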